BRIEF ON APPEAL



Applicant herewith files this brief on appeal under 37 CFR 41.37, thereby perfecting the 
Notice of Appeal, which was filed with the United States Patent and Trademark Office on 
February 27, 2008. 

(1) Real Party in Interest 

Adobe Systems Incorporated, the assignee of this patent application, is the real party in 
interest. 

(2) Related Appeals and Interferences 

There are no related appeals or interferences. 

(3) Status of Claims 

Claims 1, 11-22, 24, 34-45 and 47-52 are pending. Claims 2-10, 23, 25-33 and 46 are 
canceled. Claims 13-21 and 36-44 are withdrawn. Claims 1, 11, 12, 22, 24, 34-35, 45 and 47-52 
are rejected, the rejection of which is appealed. 

(4) Status of Amendments 

There are no unentered amendments. 

(5) Summary of Claimed Subject Matter 

Claim 1 recites a computer-implemented method for generating an audio-based form 
represented electronically as a digital audio file, the audio-based form including one or more data 
fields (e.g., p. 4, ll. 12-13). The method includes defining zoning information identifying a 
temporal location and temporal dimensions of the one or more data fields of the audio-based 
form (e.g., p. 11, ll. 13-14). Structural information is defined including a name for each of the 
one or more data fields and a description of a type of user data expected to be provided for each 
of the one or more data fields (e.g., p. 11, ll. 16-18). The audio-based form comprises audio 
signals recording a voice speaking a name of a data field followed by a pause during which a 
user can speak the user data expected to be provided for the data field (e.g., p. 11, ll. 3-6). The 
zoning and structural information is encoded in one or more audio signals (e.g., p. 11, ll. 11-13). 
The one or more audio signals including the encoded zoning and structural information are 
incorporated into the audio-based form (e.g., p. 11, ll. 11-13). 

Claim 11 depends from claim 1 and recites that data entered on the form by a user can be 
extracted from the audio-based form based on the encoded zoning and structural information 
without access to a source of zoning or structural information external to the form (e.g., p. 11, ll. 
18-20; p. 7, ll. 23-25). 

Claim 12 recites a computer-implemented method for creating an audio-based form 
represented electronically as a digital audio file, the audio-based form including one or more data 
fields (e.g., p. 4, ll. 12-13). The method includes generating a form definition defining the 
audio-based form. The form definition includes zoning information identifying a temporal location and 
temporal dimensions of the one or more data fields and structural information including a name 
for each of the one or more data fields and a description of a type of user data expected to be 
provided for each of the one or more data fields (e.g., p. 11, ll. 11-18). The audio-based form 
includes audio signals recording a voice speaking a name of a data field followed by a pause 
during which a user can speak the user data expected to be provided for the data field (e.g., p. 11, 
ll. 3-6). The zoning and structural information is encoded into one or more audio signals (e.g., p. 
11, ll. 11-12). The one or more audio signals including the encoded zoning and structural 
information are incorporated into the audio-based form (e.g., p. 11, ll. 11-12). Audio data 
entered into the audio-based form by a user can be extracted from the audio-based form based on 
the encoded zoning and structural information without access to a source of zoning or structural 
information external to the audio-based form (e.g., p. 11, ll. 18-20; p. 7, ll. 23-25). 

Claim 22 recites a computer-implemented method for creating an audio-based form 
represented electronically as a digital audio file, the audio-based form including one or more data 
fields (e.g., p. 11, ll. 4-6 and 11-16). The method includes generating a form definition defining 
the audio-based form, the form definition including zoning information identifying a temporal 
location and temporal dimensions of the one or more data fields (e.g., p. 11, ll. 11-16). The 
audio-based form comprises audio signals recording a voice speaking a name of a data field 
followed by a pause during which a user can speak the user data expected to be provided for the 
data field (e.g., p. 11, ll. 3-6). The zoning information is encoded in one or more audio signals 
(e.g., p. 11, ll. 11-12). The one or more audio signals including the encoded zoning information 
are incorporated into the audio-based form (e.g., p. 11, ll. 11-12). Data entered into the 
audio-based form by a user can be extracted from the audio-based form based on the encoded zoning 
information without access to a source of zoning information external to the audio-based form 
(e.g., p. 11, ll. 18-20; p. 7, ll. 23-25). 

Claim 24 recites a computer program product, tangibly stored on a computer-readable 
medium, for generating an audio-based form represented electronically as a digital audio file, the 
audio-based form including one or more data fields (e.g., p. 4, ll. 12-13). The computer program 
product comprises instructions operable to cause a programmable processor to define zoning 
information identifying a temporal location and temporal dimensions of the one or more data 
fields of the form (e.g., p. 11, ll. 13-14). The instructions are further operable to cause a 
programmable processor to define structural information including a name for each of the one or 
more data fields and a description of a type of user data expected to be provided for each of the 
one or more data fields (e.g., p. 11, ll. 16-18). The audio-based form comprises audio signals 
recording a voice speaking a name of a data field followed by a pause during which a user can 
speak the user data expected to be provided for the data field (e.g., p. 11, ll. 3-6). The zoning 
and structural information is encoded into one or more audio signals (p. 11, ll. 11-13). The one 
or more audio signals including the encoded zoning and structural information are incorporated 
into the audio-based form (p. 11, ll. 11-13). 

Claim 34 depends from claim 24 and recites that data entered on the audio-based form by 
a user can be extracted from the audio-based form based on the encoded zoning and structural 
information without access to a source of zoning or structural information external to the 
audio-based form (e.g., p. 11, ll. 18-20; p. 7, ll. 23-25). 

Claim 35 recites a computer program product, tangibly stored on a computer-readable 
medium, for creating an audio-based form represented electronically as a digital audio file, the 
audio-based form including one or more data fields (e.g., p. 4, ll. 12-13). The computer program 
product comprises instructions operable to cause a programmable processor to generate a form 
definition defining the audio-based form. The form definition includes zoning information 
identifying a temporal location and temporal dimensions of the one or more data fields and 
structural information, including a name for each of the one or more data fields and a description 
of a type of user data expected to be provided for each of the one or more data fields (e.g., p. 11, 
ll. 11-18). The audio-based form comprises audio signals recording a voice speaking a name of a 
data field followed by a pause during which a user can speak the user data expected to be 
provided for the data field (e.g., p. 11, ll. 3-6). The zoning and structural information is encoded 
into one or more audio signals (e.g., p. 11, ll. 11-12). The one or more audio signals including 
the encoded zoning and structural information are incorporated into the audio-based form (e.g., 
p. 11, ll. 11-12). Data entered into the form by a user can be extracted from the audio-based 
form based on the encoded zoning and structural information without access to a source of 
zoning or structural information external to the audio-based form (e.g., p. 11, ll. 18-20; p. 7, ll. 
23-25). 

Claim 45 recites a computer program product, tangibly stored on a computer-readable 
medium, for creating an audio-based form represented electronically as a digital audio file, the 
audio-based form including one or more data fields (p. 4, ll. 12-13). The computer program 
product comprises instructions operable to cause a programmable processor to generate a form 
definition defining the audio-based form, the form definition including zoning information 
identifying a temporal location and temporal dimensions of the one or more data fields (p. 11, ll. 
11-16). The audio-based form comprises audio signals recording a voice speaking a name of a 
data field followed by a pause during which a user can speak the user data expected to be 
provided for the data field (p. 11, ll. 3-6). The zoning information is encoded into one or more 
audio signals (p. 11, ll. 11-12). The one or more audio signals including the encoded zoning 
information are incorporated into the audio-based form (p. 11, ll. 11-12). The data entered into 
the audio-based form by a user can be extracted from the audio-based form based on the encoded 
zoning information without access to a source of zoning information external to the form (p. 11, 
ll. 18-20; p. 7, ll. 23-25). 

Claim 47 depends from claim 1 and further comprises encoding instructions indicating 
where and how to transmit user data extracted from the audio-based form into one or more audio 
signals. The one or more audio signals including the encoded instructions are incorporated into 
the audio-based form (e.g., p. 10, ll. 23-24). 

Claim 48 depends from claim 12 and further comprises encoding instructions indicating 
where and how to transmit user data extracted from the audio-based form into one or more audio 
signals. The one or more audio signals including the encoded instructions are incorporated into 
the audio-based form (e.g., p. 10, ll. 23-24). 

Claim 49 depends from claim 22 and further comprises encoding instructions indicating 
where and how to transmit user data extracted from the audio-based form into one or more audio 
signals. The one or more audio signals including the encoded instructions are incorporated into 
the audio-based form (e.g., p. 10, ll. 23-24). 

Claim 50 depends from claim 24 and further comprises instructions operable to cause a 
programmable processor to encode instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals. The one or more audio 
signals including the encoded instructions are incorporated into the audio-based form (e.g., p. 10, 
ll. 23-24). 

Claim 51 depends from claim 35 and further comprises instructions operable to cause a 
programmable processor to encode instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals. The one or more audio 
signals including the encoded instructions are incorporated into the audio-based form (e.g., p. 10, 
ll. 23-24). 

Claim 52 depends from claim 45 and further comprises instructions operable to cause a 
programmable processor to encode instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals. The one or more audio 
signals including the encoded instructions are incorporated into the audio-based form (e.g., p. 10, 
ll. 23-24). 

(6) Grounds of Rejection to be Reviewed on Appeal 

Claims 1, 11, 12, 22, 24, 34, 35, 45 and 47-52 are rejected under 35 U.S.C. § 103(a) as 
being unpatentable over U.S. published patent application no. 2004/0254791A1 ("Coifman US 
2004/0254791") claiming priority to US provisional application no. 60/451,024 ("Coifman 
Provisional") in view of US published patent application no. 2002/0067854A1 ("Reintjes"). 

(7) Argument 

Claims 1, 11, 12, 22, 24, 34, 35, 45 and 47-52 are not properly rejected under 35 U.S.C. § 
103(a) as being unpatentable over Coifman in view of Reintjes. 

Claims 1, 11 and 47 

Claim 1 reads as follows: 

A computer-implemented method for generating an audio-based form represented 
electronically as a digital audio file, the audio-based form including one or more 
data fields, the method comprising: 

defining zoning information identifying a temporal location and temporal 
dimensions of the one or more data fields of the audio-based form; 

defining structural information including a name for each of the one or 
more data fields and a description of a type of user data expected to be provided 
for each of the one or more data fields, where the audio-based form comprises 
audio signals recording a voice speaking a name of a data field followed by a 
pause during which a user can speak the user data expected to be provided for the 
data field; 

encoding the zoning and structural information in one or more audio 
signals; and 

incorporating the one or more audio signals including the encoded zoning 
and structural information into the audio-based form. 

First, the applicant would like to point out that the Examiner's primary reference, 
Coifman US 2004/0254791, is the publication of US patent application serial no. 10/791,626, 
which was filed on March 1, 2004, after the filing date of the present application, which was 
filed in November, 2003. However, Coifman US 2004/0254791 does claim priority to an 
earlier-filed provisional application, having serial no. 60/451,024, i.e., Coifman Provisional. The 
applicant respectfully submits that any new matter included in Coifman US 2004/0254791 is not 
prior art relative to the present application. Accordingly, the Examiner may only rely on matter 
included in Coifman US 2004/0254791 that was included in Coifman Provisional. The Examiner 
has cited references to Coifman US 2004/0254791, rather than Coifman Provisional. In the 
arguments below, the applicant addresses the cited portions of Coifman US 
2004/0254791, but does so without conceding that said portions are supported by and 
disclosed in Coifman Provisional. The paragraph citations below are to Coifman US 
2004/0254791 unless specifically stated to be citations to Coifman Provisional. 

In any event, the applicant respectfully submits that the Examiner has misconstrued 
Coifman. The Examiner relies on Coifman as disclosing "an audio-based form represented 
electronically as a digital audio file, the audio-based form including one or more data fields", and 
points to Coifman's Figures 2 and 3 and paragraphs 29, 30 and 39. Coifman discloses "a method 
for improving the accuracy of a computerized, speech recognition system" (para. 9). At 
Coifman's paragraph 29 (relied on by the Examiner), Coifman states that "as more text-based 
applications accompany people's use of the internet, for example, such vocal input may be used 
to provide inputs to text field within a particular form, field or web page displayed by an internet 
browser". Coifman then goes on to describe how fields in a computerized electronic form can be 
input by a user's voice using the speech recognition system described (e.g., para. 52). That is, 
the form in Coifman is a visually displayed form and the fields within the visually displayed 
form can be completed by a user speaking the data rather than, for example, typing in the data. 
An example electronic form is shown in Figure 3. See also Coifman Provisional at p. 3, first 
para., lines 8-10. 

By contrast, claim 1 recites an audio-based form represented electronically as a digital 
audio file. The second limitation of claim 1 further clarifies what is meant by "audio-based 
form", and requires that the "audio-based form comprises audio signals recording a voice 
speaking a name of a data field followed by a pause during which a user can speak the user data 
expected to be provided for the data field". Because the form is audio-based, there is no need for 
a visual representation of the form. Advantageously, the form can be completed by someone 
who is visually impaired or otherwise cannot read the language in which the form is prepared, or 
the form can be transmitted over a medium that does not include a means for visual display, e.g., 
over the telephone. Coifman does not disclose such an audio-based form. By contrast, Coifman 
contemplates using computers (see Figure 1) to complete a computerized electronic form (i.e., a 
visually observable form, such as shown in Figure 3). 

Reintjes also does not disclose an audio-based form. Rather, Reintjes discloses a 
"pen-based system that automatically identifies either single page or multi-page forms when data is a 
[sic] written on paper copies of the form" (Abstract). Accordingly, neither reference discloses an 
audio-based form, and in particular, neither reference discloses an audio-based form that 
"comprises audio signals recording a voice speaking a name of a data field followed by a pause 
during which a user can speak the user data expected to be provided for the data field", as 
required by claim 1. 

Claim 1 further requires that the audio-based form include "zoning information 
identifying a temporal location and temporal dimensions of the one or more data fields of the 
audio-based form". That is, because the form is audio-based, to indicate "where" within the 
audio-based form a field is located requires identifying a temporal location and to indicate the 
"size" of the field requires identifying a temporal dimension. The Examiner relies on Reintjes as 
disclosing this limitation. Because Reintjes discloses a pen-based system for use with a paper 
form, it does not make sense that Reintjes would disclose zoning information that identifies a 
temporal location and a temporal dimension of a field. Not surprisingly, Reintjes in fact does not 
make such a disclosure. By contrast, Reintjes discloses several rules that can be used to match 
text boundary boxes (i.e., boxes bounding text input by a user) and form fields (i.e., defined 
fields in a form being filled out by the user). One such rule is to find a "good field/block match 
and if the next temporal block of data is lower on the page or to the right of the previous matched 
block, begin by considering the next field on the same page of the same form as the field just 
matched" (para. 37). That is, once a field/block match is found, if the next block of data entered 
timewise (i.e., "the next temporal block") is lower on the page or to the right, then the rule is 
applied. The use of the word "temporal" in this context is just to indicate the next block of data 
entered in time (i.e., chronologically) by the user entering data. There is no disclosure of a 
temporal location of a field within an audio-based form, or of a temporal dimension of a field in 
an audio-based form. 

Further, claim 1 requires in the third limitation that the zoning information be encoded in 
one or more audio signals, and the fourth limitation requires that said audio signals be 
incorporated into the audio-based form. There is no disclosure in either Coifman or Reintjes of 
said zoning information to begin with, and certainly no disclosure of encoding same into one or 
more audio signals. Coifman discloses a computerized electronic form, and therefore does not 
disclose a form represented by audio signals in any event. Similarly, Reintjes discloses a 
pen-based paper form that is also not represented by audio signals. Since neither discloses encoding 
such zoning information into audio signals, it follows that neither discloses incorporating said 
audio signals into an audio-based form. 

Claim 1 also requires that structural information be encoded into one or more audio 
signals and that said one or more audio signals are incorporated into the audio-based form. 
Structural information includes a name for a data field and a description of a type of user data 
expected to be provided for the data field. The Examiner points to portions of Coifman that 
disclose form fields in a computerized form. However, there is nothing in Coifman that 
discloses encoding structural information about said form fields into audio signals and then 
incorporating said signals into the audio-based form (which form is represented electronically as 
a digital audio file). 

Having the zoning and structural information incorporated into the form is an 
advantageous feature of the audio-based form. The form is thereby self-describing. A machine 
processing the form to extract user-data input into the data fields knows where to find the data 
(i.e., by using the zoning information) and what type of data to expect (i.e., by using the 
structural information). Coifman, even if modified by Reintjes as suggested by the Examiner, 
does not disclose such a self-describing audio-based form. 
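
By way of illustration only, the following minimal Python sketch shows how zoning information (a temporal location and a temporal dimension) and structural information (a field name and an expected data type) could guide extraction of a user's spoken answer from a form's audio samples. The names `FieldZone` and `extract_field`, the 8 kHz sample rate and the field values are hypothetical and are not taken from the application. The sketch also assumes the zoning and structural information has already been decoded from the audio signals, since the manner of encoding is not described in this brief.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldZone:
    """Hypothetical record for one data field of an audio-based form."""
    name: str          # structural information: name of the data field
    data_type: str     # structural information: type of user data expected
    start_s: float     # zoning information: temporal location, in seconds
    duration_s: float  # zoning information: temporal dimension, in seconds


def extract_field(samples, sample_rate, zone):
    """Return the samples that fall inside the field's pause."""
    start = int(round(zone.start_s * sample_rate))
    end = start + int(round(zone.duration_s * sample_rate))
    return samples[start:end]


# Stand-in for a 10-second recording at an assumed 8 kHz sample rate.
sample_rate = 8000
form_audio = list(range(10 * sample_rate))

# A field whose pause begins 2 seconds into the form and lasts 3 seconds.
zone = FieldZone(name="last name", data_type="text", start_s=2.0, duration_s=3.0)
answer = extract_field(form_audio, sample_rate, zone)
```

In this sketch the extraction step consults only the zone record carried with the form, which mirrors the self-describing property discussed above: no source of zoning or structural information external to the form is needed.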

In short, Coifman and Reintjes do not disclose, either alone or in combination, encoding 
zoning and structural information in audio signals and incorporating the audio signals into an 
audio-based form, where the form is represented electronically as a digital audio file, and 
includes audio signals recording a voice speaking a name of a data field followed by a pause 
during which a user can speak the user data expected to be provided for the data field. 
Accordingly, claim 1 is allowable over Coifman in view of Reintjes. Claims 11 and 47 depend 
from claim 1 and are therefore allowable for at least the same reasons. 

Claims 12 and 48 

Claim 12 recites a method for creating an audio-based form represented electronically as 
a digital audio file, where the audio-based form comprises audio signals recording a voice 
speaking a name of a data field followed by a pause during which a user can speak the user data 
expected to be provided for the data field. The method includes generating a form definition 
defining the audio-based form which includes zoning information identifying a temporal location 
and temporal dimensions of one or more data fields. The form definition further includes 
structural information. The zoning and structural information is encoded into one or more audio 
signals and the audio signals are incorporated into the audio-based form. Audio data entered into 
the audio-based form by a user can be extracted from the audio-based form based on the encoded 
zoning and structural information without access to an external source of information. 

For at least the reasons discussed above in relation to claim 1, Coifman and Reintjes do 
not disclose, either alone or in combination, an audio-based form, and further do not disclose an 
audio-based form including encoded zoning and structural information incorporated therein. 
Further, there is no disclosure in either Coifman or Reintjes of extracting audio data entered into 
the form by a user based on such encoded zoning and structural information. Accordingly, claim 
12 is allowable over Coifman in view of Reintjes. Claim 48 depends from claim 12, and is 
therefore allowable for at least the same reasons. 

Claims 22 and 49 

Claim 22 recites a method for creating an audio-based form represented electronically as 
a digital audio file. The method includes generating a form definition defining the audio-based 
form, where the audio-based form comprises audio signals recording a voice speaking a name of 
a data field followed by a pause during which a user can speak the user data expected to be 
provided for the data field. The form definition includes zoning information identifying a 
temporal location and temporal dimensions of one or more data fields. The zoning information 
is encoded into one or more audio signals and the audio signals are incorporated into the audio- 
based form. Audio data entered into the audio-based form by a user can be extracted from the 
audio-based form based on the encoded zoning information without access to an external source 
of information. 

For at least the reasons discussed above in relation to claim 1, Coifman and Reintjes do 
not disclose, either alone or in combination, an audio-based form, and further do not disclose an 
audio-based form including encoded zoning information incorporated therein. Further, there is 
no disclosure in either Coifman or Reintjes of extracting audio data entered into the form by a 
user based on such encoded zoning information. Accordingly, claim 22 is allowable over 
Coifman in view of Reintjes. Claim 49 depends from claim 22, and is therefore allowable for at 
least the same reasons. 

Claims 24, 34 and 50 

Claim 24 recites a computer program product, tangibly stored on a computer-readable 
medium, for generating an audio-based form represented electronically as a digital audio file and 
including one or more data fields. The computer program product includes instructions operable 
to cause a programmable processor to define zoning information identifying a temporal location 
and temporal dimensions of the one or more data fields, and to define structural information. 
The audio-based form includes audio signals recording a voice speaking a name of a data field 
followed by a pause during which a user can speak the user data expected to be provided for the 
data field. The zoning and structural information is encoded into one or more audio signals that 
are incorporated into the audio-based form. 

For at least the reasons discussed above in relation to claim 1, Coifman and Reintjes do 
not disclose, either alone or in combination, an audio-based form, and further do not disclose an 
audio-based form including encoded zoning and structural information incorporated therein. 
Accordingly, claim 24 is allowable over Coifman in view of Reintjes. Claims 34 and 50 depend 
from claim 24, and are therefore allowable for at least the same reasons. 

Claims 35 and 51 

Claim 35 recites a computer program product, tangibly stored on a computer-readable 
medium, for creating an audio-based form represented electronically as a digital audio file, 
where the audio-based form comprises audio signals recording a voice speaking a name of a data 
field followed by a pause during which a user can speak the user data expected to be provided for 
the data field. The computer program product includes instructions operable to cause a 
programmable processor to generate a form definition defining the audio-based form which 
includes zoning information identifying a temporal location and temporal dimensions of one or 
more data fields. The form definition further includes structural information. The zoning and 
structural information is encoded into one or more audio signals and the audio signals are 
incorporated into the audio-based form. Audio data entered into the audio-based form by a user 
can be extracted from the audio-based form based on the encoded zoning and structural 
information without access to an external source of information. 

For at least the reasons discussed above in relation to claim 1, Coifman and Reintjes do 
not disclose, either alone or in combination, an audio-based form, and further do not disclose an 
audio-based form including encoded zoning and structural information incorporated therein. 
Further, there is no disclosure in either Coifman or Reintjes of extracting audio data entered into 
the form by a user based on such encoded zoning and structural information. Accordingly, claim 
35 is allowable over Coifman in view of Reintjes. Claim 51 depends from claim 35, and is 
therefore allowable for at least the same reasons. 

Claims 45 and 52 

Claim 45 recites a computer program product, tangibly stored on a computer-readable 
medium, for creating an audio-based form represented electronically as a digital audio file. The 
computer program product includes instructions operable to cause a programmable processor to 
generate a form definition defining the audio-based form, where the audio-based form comprises 
audio signals recording a voice speaking a name of a data field followed by a pause during which 
a user can speak the user data expected to be provided for the data field. The form definition 
includes zoning information identifying a temporal location and temporal dimensions of one or 
more data fields. The zoning information is encoded into one or more audio signals and the 
audio signals are incorporated into the audio-based form. Audio data entered into the audio- 
based form by a user can be extracted from the audio-based form based on the encoded zoning 
information without access to an external source of information. 

For at least the reasons discussed above in relation to claim 1, Coifman and Reintjes do 
not disclose, either alone or in combination, an audio-based form, and further do not disclose an 
audio-based form including encoded zoning information incorporated therein. Further, there is 
no disclosure in either Coifman or Reintjes of extracting audio data entered into the form by a 
user based on such encoded zoning information. Accordingly, claim 45 is allowable over 
Coifman in view of Reintjes. Claim 52 depends from claim 45, and is therefore allowable for at 
least the same reasons. 

The brief fee in the amount of $510 is being paid concurrently herewith on the Electronic 
Filing System (EFS) by way of Deposit Account authorization. Please apply any other charges 
or credits to Deposit Account No. 06-1050, referencing Attorney Docket No. 07844-612001. 

Respectfully submitted, 

Appendix of Claims 

1. A computer-implemented method for generating an audio-based form represented 
electronically as a digital audio file, the audio-based form including one or more data fields, the 
method comprising: 

defining zoning information identifying a temporal location and temporal dimensions of 
the one or more data fields of the audio-based form; 

defining structural information including a name for each of the one or more data fields 
and a description of a type of user data expected to be provided for each of the one or more data 
fields, where the audio-based form comprises audio signals recording a voice speaking a name of 
a data field followed by a pause during which a user can speak the user data expected to be 
provided for the data field; 

encoding the zoning and structural information in one or more audio signals; and 

incorporating the one or more audio signals including the encoded zoning and structural 
information into the audio-based form. 

11. The method of claim 1, wherein data entered on the form by a user can be extracted from 
the audio-based form based on the encoded zoning and structural information without access to a 
source of zoning or structural information external to the form. 

12. A computer-implemented method for creating an audio-based form represented 
electronically as a digital audio file, the audio-based form including one or more data fields, the 
method comprising: 

generating a form definition defining the audio-based form, the form definition including 
zoning information identifying a temporal location and temporal dimensions of the one or more 
data fields and structural information including a name for each of the one or more data fields 
and a description of a type of user data expected to be provided for each of the one or more data 
fields, where the audio-based form comprises audio signals recording a voice speaking a name of 
a data field followed by a pause during which a user can speak the user data expected to be 
provided for the data field; 

encoding the zoning and structural information into one or more audio signals; and

incorporating the one or more audio signals including the encoded zoning and structural
information into the audio-based form;

wherein audio data entered into the audio-based form by a user can be extracted from the 
audio-based form based on the encoded zoning and structural information without access to a 
source of zoning or structural information external to the audio-based form. 

22. A computer-implemented method for creating an audio-based form represented 
electronically as a digital audio file, the audio-based form including one or more data fields, the 
method comprising: 

generating a form definition defining the audio-based form, the form definition including 
zoning information identifying a temporal location and temporal dimensions of the one or more 
data fields, where the audio-based form comprises audio signals recording a voice speaking a
name of a data field followed by a pause during which a user can speak the user data expected to 
be provided for the data field; 

encoding the zoning information in one or more audio signals; and 

incorporating the one or more audio signals including the encoded zoning information 
into the audio-based form;

wherein data entered into the audio-based form by a user can be extracted from the
audio-based form based on the encoded zoning information without access to a source of zoning
information external to the audio-based form.

24. A computer program product, tangibly stored on a computer-readable medium, for 
generating an audio-based form represented electronically as a digital audio file, the audio-based 
form including one or more data fields, comprising instructions operable to cause a 
programmable processor to: 

define zoning information identifying a temporal location and temporal dimensions of the 
one or more data fields of the form; 

define structural information including a name for each of the one or more data fields and 
a description of a type of user data expected to be provided for each of the one or more data 
fields, where the audio-based form comprises audio signals recording a voice speaking a name of 
data field followed by a pause during which a user can speak the user data expected to be 
provided for the data field; 

encode the zoning and structural information into one or more audio signals; and 

incorporate the one or more audio signals including the encoded zoning and structural 
information into the audio-based form. 

34. The computer program product of claim 24, wherein data entered on the audio-based 
form by a user can be extracted from the audio-based form based on the encoded zoning and 
structural information without access to a source of zoning or structural information external to
the audio-based form. 

35. A computer program product, tangibly stored on a computer-readable medium, for 
creating an audio-based form represented electronically as a digital audio file, the audio-based 
form including one or more data fields, comprising instructions operable to cause a 
programmable processor to: 

generate a form definition defining the audio-based form, the form definition including 
zoning information identifying a temporal location and temporal dimensions of the one or more 
data fields and structural information including a name for each of the one or more data fields 
and a description of a type of user data expected to be provided for each of the one or more data 
fields, where the audio-based form comprises audio signals recording a voice speaking a name of 
data field followed by a pause during which a user can speak the user data expected to be 
provided for the data field; 

encode the zoning and structural information into one or more audio signals; and

incorporate the one or more audio signals including the encoded zoning and structural
information into the audio-based form;

wherein data entered into the form by a user can be extracted from the audio-based form 
based on the encoded zoning and structural information without access to a source of zoning or 
structural information external to the audio-based form. 



45. A computer program product, tangibly stored on a computer-readable medium, for 
creating an audio-based form represented electronically as a digital audio file, the audio-based 
form including one or more data fields, comprising instructions operable to cause a
programmable processor to:

generate a form definition defining the audio-based form, the form definition including
zoning information identifying a temporal location and temporal dimensions of the one or more
data fields, where the audio-based form comprises audio signals recording a voice speaking a
name of a data field followed by a pause during which a user can speak the user data expected to
be provided for the data field;

encode the zoning information into one or more audio signals; and

incorporate the one or more audio signals including the encoded zoning information into
the audio-based form;

wherein data entered into the audio-based form by a user can be extracted from the
audio-based form based on the encoded zoning information without access to a source of zoning
information external to the form.

47. The method of claim 1, further comprising:

encoding instructions indicating where and how to transmit user data extracted from the 
audio-based form into one or more audio signals; and 

incorporating the one or more audio signals including the encoded instructions into the 
audio-based form. 

48. The method of claim 12, further comprising: 

encoding instructions indicating where and how to transmit user data extracted from the 
audio-based form into one or more audio signals; and

incorporating the one or more audio signals including the encoded instructions into the
audio-based form. 

49. The method of claim 22, further comprising: 

encoding instructions indicating where and how to transmit user data extracted from the 
audio-based form into one or more audio signals; and 

incorporating the one or more audio signals including the encoded instructions into the 
audio-based form. 

50. The computer program product of claim 24, further comprising instructions operable to 
cause a programmable processor to: 

encode instructions indicating where and how to transmit user data extracted from the 
audio-based form into one or more audio signals; and 

incorporate the one or more audio signals including the encoded instructions into the 
audio-based form. 

51. The computer program product of claim 35, further comprising instructions operable to
cause a programmable processor to: 

encode instructions indicating where and how to transmit user data extracted from the 
audio-based form into one or more audio signals; and 

incorporate the one or more audio signals including the encoded instructions into the 
audio-based form.

52. The computer program product of claim 45, further comprising instructions operable to
cause a programmable processor to: 

encode instructions indicating where and how to transmit user data extracted from the 
audio-based form into one or more audio signals; and 

incorporate the one or more audio signals including the encoded instructions into the 
audio-based form.

None.

None.